Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

several caprevoke optimizations #2195

Open
wants to merge 3 commits into
base: dev
Choose a base branch
from

Conversation

markjdb
Copy link
Contributor

@markjdb markjdb commented Aug 14, 2024

The procstat-related patches should perhaps be upstreamed as well, though procstat vm doesn't have a verbose mode in vanilla FreeBSD.

SYSCTL_COUNTER_U64(_vm_stats_cheri_revoke, OID_AUTO, skip_obj_no_hascap, CTLFLAG_RD,
&cheri_skip_obj_no_hascap,
"Virtual pages skipped in VM objects with no capabilities");

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reason to prefer this over the caprevoke stats infrastructure? (Though admittedly I'd not be sad to see the latter get ripped out.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly that I wanted to have a global view of the counter value, and sysctl is more convenient for that. The existing stats infrastructure collects per-process stats and I believe is limited to printing them when the process exits. The right direction might be a hybrid scheme wherein we maintain per-process (really, per-vmspace) and global counters, and use the former to update the latter after each scan.

goto fini;
if ((entry->max_protection & VM_PROT_READ_CAP) == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commit message should probably say "have PROT_READ_CAP clear"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but the commit log is not very clear. I meant that the existing check (max_prot & PROT_READ_CAP) == 0 does not exclude, e.g., mappings of executable ELF file segments, so without this change we end up scanning those pages unnecessarily.

Copy link
Member

@brooksdavis brooksdavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've given this all a light skim and it generally seems good. The second commit message makes sense now that I understand what's going on, but spelling out more from the big block comment is probably worthwhile.

sys/sys/user.h Outdated
@@ -576,7 +576,8 @@ struct kinfo_vmentry {
} kve_type_spec;
uint64_t kve_vn_rdev; /* Device id if device. */
uint64_t kve_reservation; /* Map reservation */
int _kve_ispare[6]; /* Space for more stuff. */
int kve_max_protection; /* Max protection bitmask. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should upstream kve_max_protection and swap them around in CheriBSD before the next release?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

markjdb and others added 3 commits October 9, 2024 12:21
If, when scanning, we reach a VA that is unmapped and corresponds to a
non-resident page, and the VM object is swap-backed and has no swap blocks
assigned, we skip to the next resident page.

Extend this optimization to the case where the object has swap blocks assigned:
the in-memory tracking structure for a swap block includes a bitmap of all the
capability tags, so when skipping we can search for the closer of
- the next resident page
- the next page backed by a swap block that has at least one capability tag.
This skips over vnode mappings that have PROT_READ_CAP set in max_prot
and thus avoids a lot of needless scanning.

Add some counters so that we can see the relative effectiveness of different
checks in reducing the number of scanned virtual pages.
This avoids some needless work in cases where a short-lived process triggers
async revocation.
@markjdb markjdb force-pushed the dev-caprevoke-optimizations branch from f4b1b61 to f0dce4c Compare October 9, 2024 12:21
@markjdb
Copy link
Contributor Author

markjdb commented Oct 9, 2024

I dropped the procstat changes since those are coming from upstream and aren't critical to the rest of the PR.

@@ -192,15 +199,14 @@ vm_cheri_revoke_kproc(void *arg __unused)
/*
* Do the actual revocation pass.
*/
error = vm_cheri_revoke_pass_locked(&arc->cookie);
error = vm_cheri_revoke_pass_locked(arc->vm, &arc->cookie);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks a bit weird to use this after it was seemingly free'd above. It might be clearer to the reader with a comment and a local vm temporary like so:

     vmspace_switch_aio(&arc->vm);
     vm = arc->vm;

     /*
      * Drop arc's reference on the vmspace.  The scanner kproc holds a reference via
      * its p_vmspace pointer while the scanning the address space.  Dropping the
      * reference here permits the scan to return early if target process exits early
      * leaving the scanner kproc's reference as the only reference.
      */
     vmspace_free(arc->vm);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants